Experiences in Building the Let's MT! Portal on Amazon EC2
Abstract
In this presentation I will discuss the design and implementation of Let's MT!, a collaborative platform for building statistical machine translation (SMT) systems. The goal of the platform is to make MT technology developed in academia accessible to professional translators, freelancers, and everyday users, without requiring technical skills or deep background knowledge of the approaches used in the backend of the translation engine. The main challenge in this project was developing a robust environment that can serve a growing community and large numbers of user requests. The key to success is a distributed environment that allows maximum scalability and robustness. With this in mind, we developed a modular platform that can be scaled by adding new nodes to the different components of the system. We opted for a cloud-based solution on Amazon EC2 to create a cost-efficient environment that can be adjusted dynamically to user needs and system load. In the presentation I will explain our design of the distributed resource repository, the SMT training facilities, and the actual translation service. I will also address data security and the optimization of the training procedures to fit our setup and the expected usage of the system.
Similar resources
P-GRADE Portal: A generic workflow system to support user communities
P-GRADE portal is an open-source multi-grid portal that supports creation, execution and management of traditional and parameter study workflows on gLite and GT-2 infrastructures. However, some user communities need support for other grid types, like desktop grids or the Amazon EC2 cloud, as well. In this paper we will present the internal architecture of P-GRADE Portal as well as the implement...
Early Cloud Experiences with the Kepler Scientific Workflow System
With the increasing popularity of Cloud computing, there are more and more requirements for scientific workflows to utilize Cloud resources. In this paper, we present our preliminary work and experiences on enabling the interaction between the Kepler scientific workflow system and the Amazon Elastic Compute Cloud (EC2). A set of EC2 actors and Kepler Amazon Machine Images are introduced wit...
Experiences building Globus Genomics: a next-generation sequencing analysis service using Galaxy, Globus, and Amazon Web Services
We describe Globus Genomics, a system that we have developed for rapid analysis of large quantities of next-generation sequencing (NGS) genomic data. This system achieves a high degree of end-to-end automation that encompasses every stage of data analysis including initial data retrieval from remote sequencing centers or storage (via the Globus file transfer system); specification, configuratio...
Algorithmic-Based Fault Tolerance for Matrix Multiplication on Amazon EC2
Cloud computing presents a unique alternative to traditional computing approaches for many users and applications. The goals of this project were to assess the viability of the cloud for scientific computing applications, and to explore fault tolerance as a mechanism for maintaining high performance in this variable and unpredictable environment. Most previous attempts to run scientific computa...
Comparison of Cloud vs. Tape Backup Performance and Costs with Oracle Database
The current practice of backing up data is based on using backup tapes and remote locations for storing data. Nowadays, with the advent of cloud computing, a new concept of database backup emerges. The paper presents the possibility of making backup copies of data in the cloud. We are mainly focused on performance and economic issues of making backups in the cloud in comparison to traditional backup...
Publication date: 2013